ISO2024 INTRODUCTORY SPATIAL 'OMICS ANALYSIS

  • HYBRID : TORONTO & ZOOM
  • 10TH JULY 2024

**Module 4 : Drawing the boundaries**

Instructor : Shamini Ayyadhury


TOPICS COVERED

  • A. Classical segmentation
  • B. Segmentation-free

A. CLASSICAL SEGMENTATION
In [ ]:
### import the following libraries
import sys
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import scanpy as sc
#from PIL import Image
import os
import warnings
import tifffile as tiff

warnings.filterwarnings('ignore')
sys.path.append('/home/shamini/data/projects/spatial_workshop/')
import pre_processing_fnc as ppf
In [ ]:
### directory & filepaths
data_dir = '/home/shamini/data1/data_orig/data/spatial/xenium/10xGenomics/cell_seg_brain_cancer/'
out = '/home/shamini/data/projects/spatial_workshop/out/module4/'
os.makedirs(out+'figures/', exist_ok=True)  ### 'out' already ends in module4/

We will load the following files:

  1. transcripts_subset
    • This is a smaller subset of a larger file from a human brain cancer sample.
  2. composite_image
    • This is the corresponding image file, reduced and processed to show the 4 channel markers used for cell segmentation staining.
  3. cell_boundaries
    • This file contains the polygon information for the cell boundaries.

The processing steps that were used to derive the above files can be found in supplementary script 06.

In [ ]:
cell_boundaries = pd.read_csv(out+'cell_boundaries_subset.csv')
transcripts_subset_3g = pd.read_csv(out+'transcripts_subset_3g.csv')
iF_crop = tiff.imread(out+'cropped_image_fluo.tif')


#----------------------------------------------
ppf.get_memory_usage() ### monitor memory usage
Out[ ]:
'Memory usage: 781.73 MB'
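The helper ppf.get_memory_usage() belongs to the workshop's own pre_processing_fnc module, whose code is not shown here. For readers without that module, a minimal stand-in could look like the sketch below. It assumes a Unix-like system (the standard-library resource module is not available on Windows), the function name get_memory_usage_sketch is hypothetical, and it reports the peak memory of the process, which may differ from what the workshop helper measures.

```python
import resource
import sys


def get_memory_usage_sketch():
    """Return the peak resident memory of this process as a formatted string."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS
    divisor = 1024 ** 2 if sys.platform == 'darwin' else 1024
    return f'Memory usage: {peak / divisor:.2f} MB'


print(get_memory_usage_sketch())
```

Calling it after each heavy step, as the cells in this module do, gives a rough idea of how much memory the images and tables consume.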
In [ ]:
composite_img = ppf.plot_composite_image(iF_crop)

#----------------------------------------------
ppf.get_memory_usage() ### monitor memory usage
Shape of iF_crop: (4, 4705, 4705)
Channel 0 max: 9350, min: 6
Channel 1 max: 10806, min: 0
Channel 2 max: 10972, min: 2
Channel 3 max: 8295, min: 0
Out[ ]:
'Memory usage: 1035.36 MB'
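The function ppf.plot_composite_image() is a workshop helper whose implementation is not shown. One common way to build such a composite is to rescale each channel to [0, 1], give each channel a colour, and sum the coloured channels. The sketch below illustrates that idea only; the colour assignment and the name make_composite are assumptions, not the helper's actual behaviour. The final clipping step keeps the values inside the [0, 1] range that imshow expects for floating-point RGB data.

```python
import numpy as np

# Hypothetical colour assigned to each of the 4 channels (RGB weights)
CHANNEL_COLOURS = np.array([
    [0.0, 0.0, 1.0],   # channel 0 -> blue
    [0.0, 1.0, 0.0],   # channel 1 -> green
    [1.0, 0.0, 0.0],   # channel 2 -> red
    [1.0, 1.0, 0.0],   # channel 3 -> yellow
])


def make_composite(stack, colours=CHANNEL_COLOURS):
    """Blend a (channels, height, width) stack into a (height, width, 3) RGB image."""
    stack = stack.astype(np.float32)
    # Rescale every channel independently to the range [0, 1]
    lo = stack.min(axis=(1, 2), keepdims=True)
    hi = stack.max(axis=(1, 2), keepdims=True)
    scaled = (stack - lo) / np.maximum(hi - lo, 1e-9)
    # Weighted sum over channels: (c, h, w) x (c, 3) -> (h, w, 3)
    rgb = np.tensordot(scaled, colours, axes=([0], [0]))
    # Summed channels can exceed 1, so clip to the range imshow expects
    return np.clip(rgb, 0.0, 1.0)


# Small random stack standing in for iF_crop
rng = np.random.default_rng(0)
demo_stack = rng.integers(0, 10000, size=(4, 8, 8))
demo_rgb = make_composite(demo_stack)
print(demo_rgb.shape)
```

Rescaling per channel stops a single bright marker from dominating the composite, at the cost of losing the relative intensities between channels.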

STOP FOR DISCUSSION/LECTURE

In [ ]:
'''
FIGURE 1A - PLOT THE INDIVIDUAL CHANNELS & THE COMPOSITE IMAGE
'''
fig, ax = plt.subplots(1, 5, figsize=(40, 10))
fig.suptitle('A1. Individual Channels & Composite Image', fontsize=30, fontweight='bold', y=0.95, x=0.3)

for i in range(4):
    ax[i].imshow(iF_crop[i,:,:], cmap='gray')
    ax[i].axis('off')
ax[4].imshow(composite_img)



'''
FIGURE 1 - PLOT THE INDIVIDUAL CHANNELS & THE COMPOSITE IMAGE WITH ZOOMED IN VIEW
'''
xlower = 3500
ylower = 3500
xlim = [xlower, xlower+600]
ylim = [ylower, ylower+600]
 
fig, ax = plt.subplots(1, 5, figsize=(40, 10))
fig.suptitle('A2. Individual Channels & Composite Image (Zoomed In)', fontsize=30, fontweight='bold', y=0.95, x=0.35)

for i in range(4):
    ax[i].imshow(iF_crop[i,:,:], cmap='gray')
    ax[i].set_xlim(xlim)
    ax[i].set_ylim(ylim)
    ax[i].axis('off')
ax[4].imshow(composite_img)
ax[4].set_xlim(xlim)
ax[4].set_ylim(ylim)


#----------------------------------------------
ppf.get_memory_usage() ### monitor memory usage
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers). Got range [0.0..1.6224868].
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers). Got range [0.0..1.6224868].
Out[ ]:
'Memory usage: 1903.07 MB'
[Figure A1. Individual Channels & Composite Image]
[Figure A2. Individual Channels & Composite Image (Zoomed In)]

STOP FOR DISCUSSION/LECTURE

  • Participants will now explore the images by altering the xlower and ylower parameters in the next cell.
  • Look carefully at the cell shapes and their surrounding environments to appreciate how difficult the segmentation problem is.
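When trying different values of xlower and ylower, it is easy to pick a window that runs off the edge of the 4705 x 4705 image. The small helper below is a sketch of one way to guard against that; the name zoom_window and its default arguments are illustrative and not part of the workshop code.

```python
def zoom_window(xlower, ylower, size=600, width=4705, height=4705):
    """Return (xlim, ylim) for a size x size window kept inside the image."""
    # Shift the lower corner back if the window would run past the image edge
    xlower = max(0, min(xlower, width - size))
    ylower = max(0, min(ylower, height - size))
    return [xlower, xlower + size], [ylower, ylower + size]


# A request that would overshoot the right-hand edge is pulled back inside
xlim, ylim = zoom_window(4500, 2000)
print(xlim, ylim)
```

The returned lists can be passed straight to ax.set_xlim() and ax.set_ylim().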
In [ ]:
### ---------- Participants to alter the following parameters ----------

xlower = 0
ylower = 2000

### ---------------------------------------------------------------------

xlim = [xlower, xlower+600]
ylim = [ylower, ylower+600]

fig, ax = plt.subplots(1, 5, figsize=(40, 10))
fig.suptitle('A2. Individual Channels & Composite Image (Zoomed In)', fontsize=40, fontweight='bold', y=0.95, x=0.30)

for i in range(4):
    ax[i].imshow(iF_crop[i,:,:], cmap='gray')
    ax[i].set_xlim(xlim)
    ax[i].set_ylim(ylim)
    ax[i].axis('off')
ax[4].imshow(composite_img)
ax[4].set_xlim(xlim)
ax[4].set_ylim(ylim)

fig, ax = plt.subplots(figsize=(15, 15))
fig.suptitle('A3. Composite Image (Zoomed In)', fontsize=20, fontweight='bold', y=0.95, x=0.25)


ax.imshow(composite_img)
ax.set_xlim(xlim)
ax.set_ylim(ylim)


#----------------------------------------------
ppf.get_memory_usage() ### monitor memory usage
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers). Got range [0.0..1.6224868].
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers). Got range [0.0..1.6224868].
Out[ ]:
'Memory usage: 2609.59 MB'
[Figure A2. Individual Channels & Composite Image (Zoomed In)]
[Figure A3. Composite Image (Zoomed In)]

STOP FOR DISCUSSION/LECTURE

  1. Now we will complete the process by overlaying the transcripts of 3 genes that are expected to have mutually exclusive spatial expression:
    • STMN1
    • PTPRC
    • ANXA1
  2. We will also overlay the Xenium-derived cell-boundary polygons on the image.
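If the three genes really are mutually exclusive, most segmented cells should contain transcripts from only one of them. A rough way to check this numerically is to count, for each cell, how many of the marker genes were detected. The sketch below assumes the cell_id and feature_name columns shown in the transcripts table; the function name and the toy data are made up for illustration.

```python
import pandas as pd


def coexpression_summary(transcripts, genes=('STMN1', 'PTPRC', 'ANXA1')):
    """Count, per cell, how many of the marker genes were detected."""
    # Keep only marker transcripts that were assigned to a cell
    assigned = transcripts[(transcripts['cell_id'] != 'UNASSIGNED')
                           & transcripts['feature_name'].isin(genes)]
    # Rows are cells, columns are genes, values are transcript counts
    counts = pd.crosstab(assigned['cell_id'], assigned['feature_name'])
    # Number of distinct marker genes seen in each cell
    n_genes = (counts > 0).sum(axis=1)
    return counts, n_genes


# Toy data: cell 'c' contains two different markers
toy = pd.DataFrame({
    'cell_id': ['a', 'a', 'b', 'c', 'c', 'UNASSIGNED'],
    'feature_name': ['STMN1', 'STMN1', 'PTPRC', 'STMN1', 'ANXA1', 'PTPRC'],
})
counts, n_genes = coexpression_summary(toy)
mixed_fraction = (n_genes > 1).mean()
print(counts)
print(f'Fraction of cells with more than one marker: {mixed_fraction:.2f}')
```

A high fraction of mixed cells can point to segmentation errors, although it can also reflect genuine co-expression or transcripts diffusing between neighbouring cells.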
In [ ]:
transcripts_subset_3g
Out[ ]:
         transcript_id     cell_id  overlaps_nucleus  feature_name   x_location   y_location  z_location         qv  fov_name  nucleus_distance  codeword_index        group      binary
    0  281505041710735  jgibpdce-1                 0         STMN1     0.195312  3196.330000   22.405909  40.000000      AA17          0.851599             345  gene_probes    assigned
    1  281505041605346  UNASSIGNED                 0         STMN1     0.773438  3971.490200   23.115683  23.350294      AA17          1.518847             345  gene_probes  unassigned
    2  282462819330108  jpocnilp-1                 0         STMN1   790.425800     0.048828   21.617092  40.000000       Z17          2.247070             345  gene_probes    assigned
    3  282462819324347  jghcnaik-1                 0         STMN1     1.914062   107.087890   22.697530  23.249954       Z17          3.491614             345  gene_probes    assigned
    4  282462819324349  jgjmlafd-1                 1         STMN1     2.453125   842.318360   24.927584  34.959274       Z17          0.000000             345  gene_probes    assigned
  ...              ...         ...               ...           ...          ...          ...         ...        ...       ...               ...             ...          ...         ...
34826  281513632282767  jlikmhcm-1                 0         PTPRC  4626.957000  3737.761700   24.899092  29.833567      AA19          0.793769             203  gene_probes    assigned
34827  281513632283012  jlkkdmkh-1                 0         PTPRC  4652.390600  3617.697300   24.077530  40.000000      AA19          0.294710             203  gene_probes    assigned
34828  281513632283171  UNASSIGNED                 0         PTPRC  4668.508000  4177.179700   26.068360  40.000000      AA19          6.217726             203  gene_probes  unassigned
34829  281513632283188  jlkblene-1                 1         PTPRC  4671.195300  3854.054700   24.882101  40.000000      AA19          0.000000             203  gene_probes    assigned
34830  281513632283389  jlkggjhh-1                 0         PTPRC  4692.625000  4013.666000   24.045496  34.383205      AA19          1.334653             203  gene_probes    assigned

34831 rows × 13 columns

In [ ]:
from matplotlib.patches import Polygon

### Step 1: Choose a region to zoom in
xlower = 0
ylower = 2000
xlim = [xlower, xlower+600]
ylim = [ylower, ylower+600]


### Step 2: Plot the composite fluorescence image 
fig, ax = plt.subplots(figsize=(21, 21))
ax.imshow(composite_img)
ax.set_xlim(xlim)
ax.set_ylim(ylim)

### Step 3: Plot the polygons (one red outline per cell)
for cell_id, group in cell_boundaries.groupby('cell_id'):
    # closed=True (the default) joins the last vertex back to the first
    plg = Polygon(group[['vertex_x', 'vertex_y']].values, closed=True, edgecolor='r', facecolor='none')
    ax.add_patch(plg)
# The axis limits set in Step 2 are kept; adding patches does not rescale the axes

### Step 4: Plot the transcripts for STMN1, PTPRC and ANXA1
sns.scatterplot(data=transcripts_subset_3g, x='x_location', y='y_location', hue='feature_name', ax=ax, s=42)
ax.legend(loc='upper left', bbox_to_anchor=(1, 0.5), ncol=1, fontsize=10)


#----------------------------------------------
ppf.get_memory_usage() ### monitor memory usage
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers). Got range [0.0..1.6224868].
Out[ ]:
'Memory usage: 1408.80 MB'
[Figure: composite image (zoomed in) with cell-boundary polygons and STMN1, PTPRC and ANXA1 transcripts overlaid]
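The overlay cell draws a patch for every cell in cell_boundaries, even though only a 600 x 600 window is visible. On larger regions it may help to keep only the polygons that can appear in the window before plotting. The sketch below does this with a bounding-box test; it assumes the cell_id, vertex_x and vertex_y columns used above, and the function name cells_in_window is hypothetical.

```python
import pandas as pd


def cells_in_window(boundaries, xlim, ylim):
    """Return the rows of polygons whose bounding box overlaps the window."""
    # Bounding box of every cell polygon
    bbox = boundaries.groupby('cell_id').agg(
        xmin=('vertex_x', 'min'), xmax=('vertex_x', 'max'),
        ymin=('vertex_y', 'min'), ymax=('vertex_y', 'max'),
    )
    # Two rectangles overlap when they overlap on both axes
    keep = ((bbox['xmax'] >= xlim[0]) & (bbox['xmin'] <= xlim[1])
            & (bbox['ymax'] >= ylim[0]) & (bbox['ymin'] <= ylim[1]))
    visible_ids = bbox.index[keep]
    return boundaries[boundaries['cell_id'].isin(visible_ids)]


# Toy data: cell 'in' lies inside the window, cell 'out' lies far away
toy_boundaries = pd.DataFrame({
    'cell_id': ['in', 'in', 'in', 'out', 'out', 'out'],
    'vertex_x': [10, 20, 15, 3000, 3010, 3005],
    'vertex_y': [2010, 2010, 2020, 100, 100, 110],
})
visible = cells_in_window(toy_boundaries, xlim=[0, 600], ylim=[2000, 2600])
print(sorted(visible['cell_id'].unique()))
```

The filtered table can then be grouped by cell_id and plotted exactly as in Step 3.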

DISCUSS/LECTURE

  1. What other image-based segmentation methods could you try?
  2. What factors do we need to take into account when choosing a segmentation model?
  3. Do you see errors? How do we evaluate them? (This is covered in the next lecture.)
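As a first, very rough look at question 3, the binary column of the transcripts table already records whether each transcript fell inside a segmented cell. The sketch below summarises that column per gene. It is only a starting point under that assumption, not a full evaluation of segmentation quality, and the function name is illustrative.

```python
import pandas as pd


def assigned_fraction(transcripts):
    """Fraction of transcripts per gene that fall inside a segmented cell."""
    is_assigned = transcripts['binary'].eq('assigned')
    # The mean of a boolean series is the fraction of True values
    return is_assigned.groupby(transcripts['feature_name']).mean()


# Toy data standing in for transcripts_subset_3g
toy_tx = pd.DataFrame({
    'feature_name': ['STMN1', 'STMN1', 'PTPRC', 'PTPRC', 'PTPRC'],
    'binary': ['assigned', 'unassigned', 'assigned', 'assigned', 'unassigned'],
})
fractions = assigned_fraction(toy_tx)
print(fractions)
```

A gene with a low assigned fraction may be expressed in cells that the segmentation missed or drew too tightly.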

END OF MODULE 4 : CLASSICAL SEGMENTATION
Thank you, and see you in the next module, where we will try a non-image-based method.